ABSTRACT

When two or more different video streams are compressed
for concurrent transmission as multiple compressed video
bitstreams over a single shared communication channel,
control over both (1) the transmission of data over the shared
channel and (2) the compression processing that generates
the bitstreams is exercised taking into account the differing
levels of latency required for the corresponding video
applications. For example, interactive video games typically
require lower latency than other video applications such as
video streaming, web browsing, and electronic mail. A
multiplexer and traffic controller takes these differing
latency requirements, along with bandwidth and image
fidelity requirements, into account when controlling both
traffic flow and compression processing. In addition, an
off-line profiling tool analyzes typical video applications
in order to generate profiles of different types of
video applications that are then accessed in real time by a
call admission manager responsible for controlling the
admission of new video application sessions as well as the
assignment of admitted applications to specific available
video encoders, which themselves may differ in video
compression processing power as well as in the degree to which
they allow external processors (like the multiplexer and
traffic controller) to control their internal compression
processing.

40 Claims, 4 Drawing Sheets 
LATENCY-BASED STATISTICAL
MULTIPLEXING 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

This application claims the benefit of the filing dates of 
U.S. provisional application No. 60/114,834, filed on Jan. 6, 
1999, U.S. provisional application No. 60/114,842, filed on 
Jan. 6, 1999, and U.S. provisional application No. 60/170,883, filed on
Dec. 15, 1999, using U.S. Express Mail Label No. 
EL416189565US. 

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the compression and 
transmission of video signals, and, in particular, to the 
compression and transmission of multiple compressed video 
streams over a single, shared communication channel. 

2. Description of the Related Art 

Whenever two or more different video applications share 
a single communication channel having a finite bandwidth, 
management of the allocation of that bandwidth to those 
different applications needs to be performed, at least at some 
level. In a fixed multiplexing scheme, each application is 
assigned a fixed — although possibly different — allocation of 
the total available bandwidth, where the sum of the fixed 
bandwidth allocations is not greater than the total channel
bandwidth. Fixed multiplexing schemes are appropriate for 
video applications having constant or at least relatively 
constant bit rates, or for situations in which the total avail- 
able channel bandwidth is greater than the sum of the 
maximum bandwidth requirements for all of the video 
applications. 

Many video applications, on the other hand, have variable 
bit rates. A conventional MPEG-2 video encoder, for 
example, encodes sequences of video images by applying a 
repeating pattern of frame types referred to as the group of
pictures (GOP) structure. For example, a typical 15-frame
GOP structure may be (IBBPBBPBBPBBPBB), where I
represents a frame encoded using only intra-frame encoding 
techniques, P represents a frame encoded using inter-frame 
encoding techniques based on a previous anchor (i.e., the 
previous I or P) frame, and B represents a frame encoded 
using inter-frame encoding techniques based on either a 
previous anchor frame (forward prediction), a subsequent 
anchor frame (backward prediction), or an average of pre- 
vious and subsequent anchor frames (interpolated 
prediction). B frames are never used as anchor (i.e., 
reference) frames for encoding other frames. 

In typical video sequences, I frames require significantly 
more bits to encode than P and B frames. In addition, since 
predictive encoding schemes, like MPEG-2, take advantage 
of similarities between frames, frames associated with scene 
changes in video imagery, where frame-to-frame similarity 
is often low, will also typically require more bits to encode 
than those frames in the middle of a scene. As such, the 
compressed video bitstream for a typical video sequence 
encoded based on a video compression scheme like 
MPEG-2 that relies on a relatively steady GOP structure will 
have a variable bit rate profile typically consisting of rela- 
tively narrow "peaks" of high bit rate corresponding to I 
frames and/or scene changes, separated by relatively wide 
"valleys" of lower, more uniform bit rate corresponding to 
sequences of P and B frames. 
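
The peak-and-valley profile described above can be illustrated with a
minimal Python sketch. The per-frame bit counts in ASSUMED_BITS are
hypothetical values chosen only to show the shape of the profile; they
are not taken from this specification or from any measured MPEG-2
stream, and scene changes are not modeled.

```python
# Hypothetical per-frame bit budgets (illustrative only, not measured values).
ASSUMED_BITS = {"I": 400_000, "P": 150_000, "B": 60_000}


def frame_types(gop, num_frames):
    """Repeat the GOP pattern (e.g. 'IBBPBBPBBPBBPBB') for num_frames frames."""
    return [gop[i % len(gop)] for i in range(num_frames)]


def bit_profile(gop, num_frames, bits=ASSUMED_BITS):
    """Return the nominal number of bits for each frame of the sequence."""
    return [bits[t] for t in frame_types(gop, num_frames)]


# Two 15-frame GOPs: narrow peaks at the I frames, wide valleys in between.
profile = bit_profile("IBBPBBPBBPBBPBB", 30)
peak = max(profile)
average = sum(profile) / len(profile)
```

With these assumed numbers the peak frame is several times larger than
the average frame, which is the gap that allocation by peak bit rate
leaves unused most of the time.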

For such non-uniform bit-rate video applications, fixed 
multiplexing schemes which allocate bandwidth based on 
peak bit-rate requirements may be inefficient, because most
of the time (i.e., the time corresponding to the lower bit-rate
valleys), any given video application will not be using its
full allocation of bandwidth. For such applications, statistical
multiplexing may be applied to improve the efficiency of
bandwidth usage.

Statistical multiplexing can be defined as:

(a) the control required for allocation of bits in proportion
to the complexity and importance of each video application
within the limits of control allowed by each video encoder,
such that:

(i) the aggregate instantaneous bit rate is less than or
equal to the channel capacity;

(ii) the minimum quality of service (QoS) requirements
for all applications are met; and

(iii) the quality is maximized for applications in the
order of their importance; and

(b) the control required in pathological cases, where the
aggregate instantaneous bit rate is greater than the
channel capacity, to minimize the loss in QoS for as
few applications as possible.
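
Part (a) of this definition can be sketched in Python as follows. This
is only an illustration under stated assumptions: the field names
(min_rate, complexity, importance) are hypothetical, and weighting the
spare capacity by the product of complexity and importance is one
simple reading of "in proportion to the complexity and importance",
not a rule given in the text. The pathological case of part (b) is
merely detected here, not handled.

```python
def allocate_bits(apps, channel_capacity):
    """Split channel_capacity among apps.

    apps: list of dicts with hypothetical keys 'name', 'min_rate',
    'complexity', and 'importance'. Each app first receives its minimum
    QoS rate; the remaining capacity is shared in proportion to
    complexity * importance (an assumed weighting).
    """
    min_total = sum(a["min_rate"] for a in apps)
    if min_total > channel_capacity:
        # Part (b) of the definition: demand exceeds capacity.
        raise ValueError("pathological case: minimum QoS exceeds capacity")

    spare = channel_capacity - min_total
    weights = [a["complexity"] * a["importance"] for a in apps]
    total_weight = sum(weights)

    allocation = {}
    for app, weight in zip(apps, weights):
        share = spare * weight / total_weight if total_weight > 0 else 0.0
        allocation[app["name"]] = app["min_rate"] + share
    return allocation


apps = [
    {"name": "game", "min_rate": 2.0, "complexity": 3, "importance": 2},
    {"name": "mail", "min_rate": 0.5, "complexity": 1, "importance": 1},
]
alloc = allocate_bits(apps, 6.0)
```

By construction the allocations sum to the channel capacity, so
condition (i) holds, and every application receives at least its
minimum rate, so condition (ii) holds.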

To achieve these levels of control, statistical multiplexing
takes into account the variations in bit rate of different video
applications when allocating transmission bandwidth.
Statistical multiplexing schemes often involve the
implementation of a dynamic bandwidth manager that controls the
allocation of bandwidth to the various video applications in
real time. Such bandwidth managers are able to monitor the
real-time bit-rate demands of the different video applications
to control the transmission of data from those different
applications over the shared communication channel.

For conventional video applications, such as video
streaming, which involves the one-way transmission of a
compressed video bitstream from a video server to one or
more remote users, the quality of service depends on the
fidelity and uniformity of the video playback, where
collectively high fidelity and high uniformity typically mean (1)
uniform, full frame rates and (2) uniform high image quality
both within each frame and between consecutive frames. For
these applications, the end-to-end latency involved in the
processing is of less importance. As such, the primary
concern of bandwidth managers for conventional statistical
multiplexing schemes involving conventional video
applications is to ensure that there will always be sufficient data
in the receiver buffer at each user node to provide
high-fidelity, uniform video playback to each user.

High levels of latency, however, are not acceptable for all
video applications. Many interactive video applications,
such as video conferencing and distributed video games
where two or more remotely located users compete against
each other, require relatively low levels of latency (in
addition to high levels of uniformity and fidelity) for
acceptable QoS levels. Moreover, in many multiplexing
situations, different video applications will have different
latency requirements. Furthermore, the latency requirements
of even some individual video applications, such as web
browsing, may vary over time, when the type of video
service changes during the application session. For all these
situations, conventional multiplexing schemes (even
conventional statistical multiplexing schemes) will not provide
acceptable QoS levels, because they do not take into account
the different and varying levels of latency required by the
different video applications being multiplexed for
transmission over a shared communication channel.

SUMMARY OF THE INVENTION

The present invention is directed to statistical multiplex- 
ing schemes that do take into account the corresponding 
latency requirements of different video applications (in
addition to other factors such as uniformity and fidelity of
video playback) when managing the bandwidth of a shared
communication channel. According to embodiments of the
present invention, the statistical multiplexing takes latency
into account to provide (a) traffic control (i.e., the control of
how the data for multiple compressed video bitstreams is
transmitted over the shared communication channel) as well
as (b) some level of control over the actual compression
processing used to generate those bitstreams for the different
video applications.

According to one embodiment, the present invention is a
method for controlling transmission over a shared
communication channel of multiple compressed video bitstreams
generated by a plurality of video encoders and corresponding
to multiple video applications, comprising the steps of
(a) receiving information for each compressed video
bitstream wherein at least two of the video applications have
different latency requirements; (b) controlling the
transmission of data from the multiple compressed video bitstreams
over the shared communication channel taking into account
the information for each compressed video bitstream and the
latency requirement of each corresponding video
application; and (c) adaptively controlling compression processing
of at least one of the video encoders taking into account the
information for the corresponding compressed video
bitstream and the latency requirement of the corresponding
video application.

According to another embodiment, the present invention
is a video processing system for controlling transmission of
multiple compressed video bitstreams corresponding to
multiple video applications over a shared communication
channel, comprising (a) a plurality of video encoders, each
configured to generate a different compressed video
bitstream for a different video application, wherein at least two
of the video applications have different latency
requirements; and (b) a controller, configured to (1) receive the
compressed video bitstreams from the video encoders; (2)
control the transmission of data from the multiple
compressed video bitstreams over the shared communication
channel taking into account information for each
compressed video bitstream and the latency requirement of each
corresponding video application; and (3) adaptively control
the compression processing of at least one of the video
encoders taking into account the information for the
corresponding compressed video bitstream and the latency
requirement of the corresponding video application.

According to yet another embodiment, the present
invention is a controller for controlling transmission of multiple
compressed video bitstreams corresponding to multiple
video applications over a shared communication channel, in
a video processing system further comprising a plurality of
video encoders, each configured to generate a different
compressed video bitstream for a different video application,
wherein at least two of the video applications have different
latency requirements, wherein the controller is configured to
(1) receive the compressed video bitstreams from the video
encoders; (2) control the transmission of data from the
multiple compressed video bitstreams over the shared
communication channel taking into account information for each
compressed video bitstream and the latency requirement of
each corresponding video application; and (3) adaptively
control the compression processing of at least one of the
video encoders taking into account the information for the
corresponding compressed video bitstream and the latency
requirement of the corresponding video application.

BRIEF DESCRIPTION OF THE DRAWINGS

Other aspects, features, and advantages of the present
invention will become more fully apparent from the
following detailed description, the appended claims, and the
accompanying drawings in which:

FIG. 1 shows a diagram of a video processing
system, according to one embodiment of the present
invention;

FIG. 2 shows an assumed piece-wise linear cost function
based on latency;

FIG. 3 shows a system-level block diagram of a computer
system, according to one embodiment of the present
invention;

FIG. 4 shows a board-level block diagram of each encoder
board of the computer system of FIG. 3; and

the final figure shows a board-level block diagram of the
statistical multiplexing board of the computer system of FIG. 3.

DETAILED DESCRIPTION

FIG. 1 shows a block diagram of a video processing
system 100, according to one embodiment of the present
invention. Video processing system 100 compresses
multiple video streams corresponding to different video
applications for transmission over a single shared communication
channel 116. The video applications may include
any suitable combination of different types of video
applications including video conferencing, interactive video games
having different levels of sophistication, web browsing, and
electronic mail. Depending on the implementation, the
shared communication channel may be any suitable
transmission path that supports the concurrent transmission of
multiple data streams, such as Ethernet, TCP/IP, Broadband
networks, satellite, cable transmission, ADSL, DSL, and
cable modem.

In particular, one or more application servers 102 provide
multiple video streams to a service admission manager 104,
which manages the admission of new video application
sessions onto the system. In response to a request for
admission by a new video application (received from
application element 106), service admission manager 104
(a) determines whether to accept the request and admit the
new video application and, if so, (b) assigns the new video
application to an appropriate video encoder.

As indicated in FIG. 1, video processing systems in
accordance with the present invention have multiple video
encoders available to perform the required video
compression processing for the different video applications, where
different video encoders may provide different levels of
video compression processing power (e.g., in terms of frame
rate and image fidelity). In general, differing levels of video
compression processing power make these different video
encoders more or less suitable for different video
applications having differing bandwidth and latency requirements.
High-demand video applications, such as high-end
interactive video games, typically have high bandwidth and low
latency requirements. At the other end of the spectrum,
low-demand video applications, such as web browsing and
electronic mail, typically have low bandwidth and high
latency requirements. In between are video applications,
such as video streaming and video conferencing, that
typically have intermediate to high bandwidth requirements and
intermediate to low latency requirements.

In addition to video compression processing power, video
encoders may also differ in the degree to which external
processors are able to control the details of their internal
compression processing. For example, some video encoders
may provide external control only at the frame level (e.g., in
terms of specifying target bit rates and/or average
quantization levels per frame). Other video encoders may also
provide external control at the sub-frame level (e.g., in terms
of specifying target bit rates and quantization levels at the
slice or even macroblock level within each frame).

Although video compression processing power and the 
degree of external control over internal compression pro- 
cessing are technically both continuous and independent 
parameters, video encoders can be grouped into three basic 
classes, as shown in FIG. 1. 

Class 1 encoders 108 provide relatively high levels of 
video compression processing power (e.g., in terms of high 
frame rates and high image fidelity), while providing rela- 
tively low levels of external control over their internal video 
compression processing. Class 1 video encoders, such as 
typical hardware encoders, are suitable for video applica- 
tions requiring both high bandwidth and low latency, such as 
high-end interactive video games. 

Class 2 encoders 110 provide slightly lower levels of 
video compression processing power than Class 1 encoders 
108, but higher levels of external control over their internal 
video compression processing. Class 2 video encoders, 
which are typically high-end software encoders, are suitable
for video applications requiring slightly lower bandwidth
and/or slightly higher latency, such as video streaming
applications and low-end interactive video games.

Lastly, Class 3 encoders 112 provide even lower levels of 
video compression processing power than Class 2 video 
encoders 110 with similar or higher levels of external control 
over their internal video compression processing. Class 3 
encoders, which are typically low-end software encoders, 
are suitable for non-time-critical (i.e., high latency)
applications with either high or low bandwidth requirements,
such as web browsing and electronic mail. 
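
The three encoder classes can be summarized in a small Python sketch
that suggests a class from an application's requirements. The
three-level scale ("low", "intermediate", "high") and the function
name are assumptions made for illustration; the text describes the
classes only qualitatively. The parameter latency_tolerance expresses
how much latency the application can accept, so a "high latency"
application in the sense of the text has a "high" tolerance.

```python
from enum import Enum


class EncoderClass(Enum):
    CLASS_1 = 1  # typically hardware: high power, little external control
    CLASS_2 = 2  # typically high-end software: more external control
    CLASS_3 = 3  # typically low-end software: lowest power


def suggest_encoder_class(latency_tolerance, bandwidth):
    """Map assumed requirement levels to an encoder class.

    Both arguments take one of 'low', 'intermediate', or 'high'.
    """
    if latency_tolerance == "high":
        # Non-time-critical applications, whatever their bandwidth.
        return EncoderClass.CLASS_3
    if latency_tolerance == "low" and bandwidth == "high":
        # High bandwidth and low latency, e.g. high-end interactive games.
        return EncoderClass.CLASS_1
    # Slightly lower bandwidth and/or slightly higher latency.
    return EncoderClass.CLASS_2
```

For example, suggest_encoder_class("low", "high") selects Class 1,
while any application with a "high" latency tolerance, such as web
browsing, is mapped to Class 3.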

As shown in FIG. 1, video processing system 100 also has 
a multiplexer (mux) and traffic controller 114 (also referred 
to herein simply as the multiplexer), which controls the 
transmission of data from the compressed video bitstreams 
generated by the various video encoders over the shared 
communication channel 116. In addition, controller 114 uses
information corresponding to the various compressed video 
bitstreams (generated by the various video encoders) to 
generate control signals that are transmitted back to one or 
more of the video encoders to adaptively control — at least at 
some level — the video compression processing performed 
by those video encoders. The information may include 
current frame rate, number of bits per frame, picture type, 
picture duration, picture capture time, and other statistics, 
such as scene change information, picture variance, motion- 
compensated-error variance, and mode statistics (e.g., num- 
ber of intra vs. inter macroblocks). Depending on the 
implementation, different types of information can be gen- 
erated and reported at frame level, slice level, or picture unit 
level. 

As indicated in FIG. 1, controller 114 generates two types 
of video compression control signals: (1) coarse control 
signals used to control video compression processing, e.g., 
at the frame level and (2) fine control signals used to control 
video compression processing at a finer level, e.g., at the 
sub-frame level. Controller 114 transmits specific coarse
video compression control signals to any of the individual 
video encoders over a coarse control bus 118. In addition, 
controller 114 transmits specific fine video compression 
control signals to any individual video encoders (e.g., Class 
1 encoders 108 and Class 2 encoders 110) that provide finer 
external control (e.g., at the sub-frame level) over their 
internal video compression processing over a fine control 
bus 120. Coarse video compression control signals
correspond to relatively high-level control over video
compression processing and may include frame rates, target numbers
of bits per frame, and/or average quantization levels over a
frame. Fine video compression control signals, on the other
hand, correspond to relatively low-level control over video
compression processing and may include target numbers of
bits per slice within a frame and average quantization levels per
slice or even per macroblock within a frame. Other types of
fine video compression control signals will be described
later in this specification.
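
The distinction between coarse and fine control signals can be
pictured with the following Python sketch. The class and field names
are hypothetical, and the rule used to derive per-slice targets from a
frame target (proportional to per-slice weights, for instance weights
favoring a region of interest) is an assumption for illustration, not
a method stated in this specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CoarseControl:
    """Frame-level settings (all names are illustrative)."""
    frame_rate: Optional[float] = None
    target_bits_per_frame: Optional[int] = None
    avg_quant_per_frame: Optional[int] = None


@dataclass
class FineControl:
    """Sub-frame settings, keyed by slice index (illustrative)."""
    target_bits_per_slice: Dict[int, int] = field(default_factory=dict)
    quant_per_slice: Dict[int, int] = field(default_factory=dict)


def split_frame_budget(target_bits_per_frame, slice_weights):
    """Divide a frame budget among slices in proportion to slice_weights.

    slice_weights must be a non-empty list of positive integers.
    """
    total = sum(slice_weights)
    budget = {
        i: target_bits_per_frame * w // total
        for i, w in enumerate(slice_weights)
    }
    # Integer division may leave a few bits over; give them to slice 0.
    budget[0] += target_bits_per_frame - sum(budget.values())
    return FineControl(target_bits_per_slice=budget)
```

An encoder that accepts only coarse control would receive a
CoarseControl value alone, while an encoder with sub-frame control
could additionally receive the FineControl derived from it.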

In addition to information for each compressed video
bitstream, controller 114 takes into account both bandwidth
and latency requirements of the various corresponding video
applications when performing both its traffic control and
compression control functions.

Video processing system 100 also has an off-line profiling
tool 122, which analyzes, in non-real-time, typical sets of
video sequences corresponding to different types of video
applications and stores the results of those analyses in an
application profiles server 124. The service admission
manager 104 accesses information in the application profiles
server 124 in order (1) to determine whether to admit a
particular new video application and, if so, (2) to determine
to which video encoder to assign the newly admitted video
application. In addition, controller 114 also accesses
information in the application profiles server 124 in order to (1)
determine an acceptable level of buffering for at least one
video application and (2) order packets of data from different
video applications. Moreover, if there is profile information
on the nominal MQUANT and MQUANT tolerance that can
be used to encode a particular application, the controller can
attempt to maintain this constraint on all the encoders. As
another example, if region of interest information is
available, and slice-level MQUANT setting is possible, the
controller can intelligently trade off and change the
MQUANT over a frame. Similar control for frame rate and
spatial resolution is also possible.

According to the embodiment shown in FIG. 1, video
processing system 100 has one or more Class 1 encoders
108, one or more Class 2 encoders 110, and one or more
Class 3 encoders 112. It will be understood that, in
alternative implementations of the present invention, video
processing systems may have fewer or more different classes of
encoders available, including those (hardware or software)
encoders that provide no degree of external control over
their internal video compression processing. With this latter
class of "uncontrolled" encoders, the traffic controller
processes the corresponding received compressed video
bitstreams for transmission over the shared communication
channel in an open-loop manner. Nevertheless, even in these
situations, the traffic controller may be able to exercise some
"post-processing" control by altering the bitstream before
transmission by dropping frames or even replacing portions
of frames such as slices or individual macroblocks with
special skip codes. Since the encoders will be unaware of
these changes, such post-processing control may adversely
affect the quality of the video playback at the end users.

Furthermore, as new and improved software encoders
provide higher and higher levels of video compression
processing power, not to mention greater and greater levels
of external control, hardware encoders might not be needed
at all in video processing system 100, even for high-end
interactive video games.

The main operations of video processing system 100 
correspond to three different generic functions: (1) off-line 
application profiling for content classification (implemented
by off-line profiling tool 122), (2) service admission pro- 
cessing (implemented by service admission manager 104), 
and (3) traffic and compression control (implemented by 
controller 114). Each of these three functions is described in 
further detail in the following sections. 

Off-Line Application Profiling for Content 
Classification 

As mentioned earlier, off-line profiling tool 122 analyzes, 
in non-real-time, typical sets of video sequences correspond- 
ing to different types of video applications and stores the 
results of those analyses in application profiles server 124. 
In a preferred implementation, the profiling is semi- 
automatic and each video application is characterized 
according to the following parameters: 

(a) Level of interactivity (related to latency tolerance); 

(b) Extent of frame-to-frame motion (both peak and
average);

(c) Encoding resource requirement (i.e., identification of 
acceptable classes of encoders) and the levels of exter- 
nal control offered by those encoders; 

(d) Type of graphics driver and ability to intercept the 
graphics commands; 

(e) Bit rates required (both peak and average) for
acceptable quality. The peak can be obtained by performing
I-frame-only encoding at an acceptable average frame-level
quantization (MQUANT) level and picking its
peak. The average bit rate can be obtained by IP-only
encoding (i.e., no B frames) at the same MQUANT
level.

(f) Minimum frame rate required to achieve acceptable 
quality for the application. 

(g) Required spatial resolution determined by identifying 
the highest spatial frequency present (e.g., from quan- 
tized DCT coefficients) and characterizing how critical 
the high-frequency components are for the application. 

(h) Region of Interest (RoI): In many applications,
especially video games, the RoI can be bounded within a
region. Knowledge of this can help the encoder as well
as the multiplexer.

(i) Objectionable artifacts: Some applications may be very 
sensitive to frame dropping, others may be sensitive to 
slice dropping, and still others may be sensitive to 
spatial adaptation of the quantizer. This profile will 
suggest the best overflow handling strategy at the 
multiplexer as well as the best way to control the 
encoder. 
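
The bit-rate measurements of parameter (e) can be sketched in Python
as follows. The sketch assumes that the profiling tool already has the
per-frame bit counts produced by the two trial encodings, and it
assumes that the "peak" is the instantaneous rate of the largest
I frame (its size multiplied by the frame rate). Both function names
and that interpretation are illustrative assumptions.

```python
def peak_bit_rate(i_only_frame_bits, frame_rate):
    """Peak rate in bits/s from an I-frame-only trial encoding.

    Assumes the peak is the largest frame sent at the full frame rate.
    """
    return max(i_only_frame_bits) * frame_rate


def average_bit_rate(ip_only_frame_bits, frame_rate):
    """Average rate in bits/s from an IP-only trial encoding."""
    mean_bits = sum(ip_only_frame_bits) / len(ip_only_frame_bits)
    return mean_bits * frame_rate
```

For instance, I-only frames of 300,000, 420,000, and 380,000 bits at
30 frames per second give a peak of 12.6 Mbit/s under this
interpretation.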

After a sufficient number of video applications have been 
analyzed off-line according to the preceding parameters, 
profiling tool 122 processes the various results to make 
generalizations about groups of video applications based on 
their collective similarities and respective differences in 
order to generate rules used by video processing system 100 
in real-time processing of other video applications. Such 
profiling can be relatively simple, such as characterizing the 
level of interactivity of different video applications as either 
"high," "intermediate," or "low." Alternatively, more and 
more sophisticated schemes can be implemented. The result- 
ing profile information is stored in application profiles server 
124 for eventual use by service admission manager 104 for 
initial service admission as well as by controller 114 for 
traffic control and multiplexing. In addition, the service 
provider for a particular video application may be able to 
maintain user profiles which indicate the behavior of
particular users (such as type of games played, levels reached,
typical browsing patterns, etc.). This information might only 
be used as a second-order control, since there may be 
multiple users with access to a particular user node. 

Service Admission Processing 

Service admission manager 104 determines the mix of the
active applications at any given time. The main task of this
manager is to ensure that only services for which (a) the
required encoder resources are available and (b) a minimum
QoS can be guaranteed for the entire session are admitted
into a multiplex pool. The service admission decision is
based on the profiles of the applications that are requested.
In one possible implementation, the different video
applications are divided into the following classes:

(C1) High-end video games having very stringent latency
requirements, high motion, and high spatial
complexity, requiring hardware encoders to achieve
high bandwidth and low latency, even though there is
little external control over the video compression
processing;

(C2) Low-end video games having moderate to high
latency requirements and lower encoding complexity,
that can be processed using high-end software encoders
to achieve low latency; and

(C3) Web browsing and e-mail applications with high
latency requirements that can be processed using
low-end software encoders.

When a request is made to add a new application, service
admission manager 104 obtains the following information
from application profiles server 124:

(1) Class of application (e.g., video game (high-end,
intermediate, or low-end), web browsing, e-mail, etc.);

(2) Interactivity of application (usually represented as
latency requirement and classification in the profiles
server) used in classifying the service, service
admission, assignment of resources, control of encoder,
and traffic control;

(3) Motion extent used to determine the frame rate
required for the application, which is used by the
controller to control the encoders. It can also be used
for resource allocation to assign an encoder to the
application;

(4) Peak bandwidth required; and

(5) Average bandwidth required.

Based on this information, service admission manager 104 
will admit the new application, if and only if both of the 
50 following two rules would be satisfied after admitting the 
new application: 

(a) Sum of the peak bandwidths for all CI applications 
plus sum of the average bandwidths for all C2 appli- 
cations is less than the total channel bandwidth; and 
55 (b) Sum of average bit rates of all applications (i.e., CI, 
C2, and C3) is less than the total channel bandwidth; 
and 

(c) Encoding resources are available for the new appli- 
cation. The first rule is fairly conservative, and applies 

60 to relatively simple implementations of the present 
invention. For more sophisticated implementations in 
which controller 114 is provided a high degree of 
control (i.e., more fine control) over the video com- 
pression processing implemented within the various 

65 video encoders, the first rule can be relaxed. Such fine 
control may involve control of slice -level and even 
macroblock-level quantizers as well as the staggering 
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of intra frames across different applications (to ensure 
that a limited number of applications have intra frame 
within the same frame time). In that case, service 
admission manager 104 can use a more complicated 
formula depending on the QoS requirements of the 
various video applications and take further advantage 
of the statistical nature of video streams. Thus, more 
applications across the various types may be able to be 
admitted, as compared to the above solution, which is 
constrained based on the peak bandwidths of the CI 
applications. Note that the motion extent and interac- 
tivity can also be used to allocate encoding resources to 
application. 

An alternative call admission strategy would be to replace
the stringent first condition by: Maximum of the sum of
the peak bandwidths of concurrent I frames possible at
a time based on the GOP structures for C1 applications,
plus the sum of the average bandwidths of the
remaining applications, is less than the total channel
bandwidth.

Such a policy would allow more C1 applications. However,
it should be noted that the probability of not meeting the
minimum QoS at a given time instant increases as the
number of active applications increases.
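One possible reading of this alternative condition is sketched below. It assumes, purely for illustration, that the C1 encoders are aligned to common frame slots, that each C1 application consumes its peak bandwidth in its I-frame slot and its average bandwidth otherwise, and that each application has a fixed GOP length and phase. The tuple layout and the function name are hypothetical.

```python
from functools import reduce
from math import gcd


def worst_case_load(c1_apps, other_avg_bw):
    """Worst-case channel load over one full cycle of the GOP patterns.

    c1_apps: list of (peak_bw, avg_bw, gop_len, phase) tuples (assumed layout).
    other_avg_bw: summed average bandwidth of the remaining applications.
    """
    # The combined I-frame pattern repeats every lcm(GOP lengths) slots.
    horizon = reduce(lambda a, b: a * b // gcd(a, b),
                     (gop for _, _, gop, _ in c1_apps), 1)
    worst = 0.0
    for slot in range(horizon):
        load = other_avg_bw
        for peak, avg, gop, phase in c1_apps:
            # An application is in its I frame once per GOP, at its phase.
            load += peak if slot % gop == phase % gop else avg
        worst = max(worst, load)
    return worst
```

With two C1 applications whose I frames are staggered by half a GOP, the worst-case load contains only one peak at a time, which is why the relaxed condition can admit more C1 applications than rule (a).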

GOP Structure and Big Picture Handling 

In one implementation of video processing system 100, 
low latency applications are assigned to video encoders that 
use only short GOP structures having only I and P (and no 
B) frames, such as IPPP, where every fourth frame is an I 
frame. Using shorter GOP structures supports interactivity. 
However, since I frames appear so frequently, hardware 
encoders may be required for such applications. In any case, 
the GOP period should be less than two seconds to handle
errors as well as to allow decoder resynchronization when
the user flips through channels. For some software encoders
that provide a high degree of external control, an adaptive 
intra-refresh strategy can be used to avoid having to send I 
frames so frequently. Instead, different parts of each frame 
are intra-refreshed in different P pictures over a period 
corresponding to a chosen GOP size. 
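A minimal sketch of such an adaptive intra-refresh schedule follows. It assumes that the refreshed "parts" are whole macroblock rows and that the rows are spread as evenly as possible over the refresh period; both choices, and the function name, are illustrative assumptions.

```python
def refresh_schedule(num_mb_rows, gop_size):
    """For each of gop_size P pictures, list the macroblock rows to intra-code.

    Every row is refreshed exactly once per period of gop_size pictures.
    """
    schedule = [[] for _ in range(gop_size)]
    for row in range(num_mb_rows):
        # Map row index proportionally onto the picture index in the period.
        schedule[row * gop_size // num_mb_rows].append(row)
    return schedule
```

For example, refresh_schedule(5, 3) assigns rows 0 and 1 to the first picture, rows 2 and 3 to the second, and row 4 to the third.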

Traffic and Compression Control 

Multiplexer and traffic controller 114 handles the follow- 
ing tasks: 

(a) advance bit allocation for each video encoder based on 
the spatial and temporal quality desired for the corre- 
sponding application, 

(b) multiplexing the different bitstreams while meeting 
the latency requirements of each application, and 

(c) handling the pathological cases in such a way as to
minimize noticeable QoS degradation and to communicate
the handling strategy to the controllable encoders.

Due to the varying degrees of control available at the 
different encoders, the bit allocation and buffer control range 
from a mere frame-level interaction between controller 114 
and each encoder to finer levels, such as at the slice- or even 
macroblock-level. In addition, the fact that the different 
applications are not frame synchronized can be exploited to 
provide frame- (or finer) level control of other services, 
while responding to an unexpectedly high instantaneous bit 
rate from a particular service. In other words, the individual 
encoders can be staggered with respect to one another over 
the frame time to allow controller 114 to control the
compression processing for certain applications based on the
results of compression processing for other applications that
fall later within the same frame time.

For one implementation of video processing system 100, 
the impact of the varying degrees of control and the varying 
QoS requirements for each class are briefly summarized 
below: 

Class C1 applications: These are encoded using hardware
encoders that may provide external control over only
the specification of frame-level target number of bits
and average MQUANT over the frame.

Class C2 applications: These are games that are software
encoded and do not take a very large bandwidth. The
applications are encoded without B frames using GOP
structures in which I frames may be encoded at
relatively large intervals. Implementing an adaptive
macroblock refresh strategy that will intra-code a fraction
of the macroblocks in every P frame can support
switching back and forth between applications while
containing error propagation as well. This will smooth
out the bit profile. Any variations will come from
content and not from the GOP structure and picture
types. Note that Class C2 applications require low
latency encoding/multiplexing. Controller 114 acts as a
video rate controller and controls the picture type, rate,
etc. The control is hierarchical: at one level, picture
type and frame-level targets are controlled; at another
level, slice-level targets are controlled. The adaptive
refresh strategy is also staggered across the different
mid-range encoders and is scheduled to coincide with
the valleys between the peaks of the Class C1
applications whenever possible.

Class C3 applications: It is assumed that web browsing
and email applications have virtually no QoS
requirements compared to Class C1 and C2 applications. Class
C3 applications can be scheduled in the gaps and
valleys of the bit profiles of the other services, so as to
increase channel utilization. Hence, their latencies can
be quite high (of the order of several frame times). For
more sophisticated encoding and multiplexing
strategies, a dynamic QoS for these services can be
determined on the fly and bandwidth allocation
proportional to this dynamic QoS can be made.

Advance Bit Allocation to Various Sources

Advance bit allocation refers to allocation of a fraction of 
the instantaneous bandwidth to each encoder based on its 
past statistics without actually knowing the actual complex- 
ity of the current frame. This is important for applications 
having low-latency requirements, which preclude look- 
ahead based bit allocation. The advance bit allocation for 
each encoder is implemented based on: 

(a) the minimum spatial quality setting needed for the 
corresponding application; 

(b) the complexity and average MQUANT for the previ- 
ous frame of the same picture type; and 

(c) the encoder buffer fullness. 

In addition, the control can also decide to skip frames based 
on the quality requirements. 
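The three inputs listed above can be combined in many ways; the following is only a sketch under stated assumptions. It assumes a simple complexity model in which the product of bits and average MQUANT for the previous frame of the same picture type is roughly constant, it expresses the minimum spatial quality as a maximum permitted MQUANT, and it scales the available bit budget down linearly as the encoder buffer fills. The function name, parameter names, and this particular model are not taken from the described system.

```python
def advance_frame_target(prev_bits, prev_mquant, channel_share_bits,
                         buffer_fullness, mquant_max, mquant_min=1.0):
    """Return (target_bits, mquant) for the next frame, or (0.0, None) to skip.

    prev_bits, prev_mquant: statistics of the previous frame of the same type.
    channel_share_bits: bits this encoder may use for the frame when its
        buffer is empty (assumed input).
    buffer_fullness: encoder buffer occupancy as a fraction in [0, 1].
    mquant_max: coarsest MQUANT that still meets the minimum spatial quality.
    """
    complexity = prev_bits * prev_mquant           # assumed model: X = bits * Q
    budget = channel_share_bits * (1.0 - buffer_fullness)
    if budget <= 0:
        return 0.0, None                           # no room: skip the frame
    mquant = complexity / budget                   # Q needed to fit the budget
    if mquant > mquant_max:
        return 0.0, None                           # quality floor unmet: skip
    mquant = max(mquant, mquant_min)               # do not spend bits needlessly
    return complexity / mquant, mquant
```

The skip branches correspond to the option, noted above, of skipping frames based on the quality requirements; no look-ahead at the current frame is needed, which is the point of advance allocation.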

Since the applications are not synchronized at the frame 
level, a frame-level target is computed for the encoder that 
will start encoding a frame next (at any given time), based 
on the average MQUANT chosen for that encoder. Using a 
rate-distortion model linking bit consumption, average 
MQUANT, and motion compensated distortion, and enforc- 
ing constraints on MQUANT, the bit count for a frame can 
be estimated from prior data. An example of the constraint
on MQUANT can be that the quality is uniform across the
applications, while ensuring that the temporal rate of change
of average MQUANT is within a tolerance threshold. The
channel bit rate is divided between the applications
according to their respective complexities and relative
significance. The complexities are updated on the fly, and the
relative significance can be obtained from the results of
off-line profiling stored in application profiles server 124.

For the less controllable encoders, only the frame-level
target (or average MQUANT) might be able to be
communicated to the encoder. For the more controllable encoders,
the basic unit of operation will be a slice (e.g., a row of
macroblocks). Because the encoders are not synchronized,
this will require a worst-case buffer requirement of 2 slices.
A slice-level target is computed for each controllable
encoder based on the frame target, the buffer fullness for that
encoder (which is indicative of the buffer delay), and the
instantaneous bit rate available after deducting the bits
(within a latency window) from the less controllable
encoders. The slice targets are also constrained by the fact that
MQUANTs cannot change too much within a frame.

For Class C3 applications, a one-frame bit buffer is used.
In other words, the encoders encode a new frame only after
all the bits for the frame that was encoded before the last
frame have been transmitted by controller 114. This
on-demand encoding eliminates the possibility of
congestion due to Class C3 services. Other strategies to tune the
encoding to suit the application's demands are discussed in
the following section.

Channel Bandwidth Allocation: Embodiment #1

Channel bandwidth allocation is different from the
instantaneous bit rate from each encoder because of the mux buffer
in controller 114. A certain amount of mux buffering is
needed to prevent the individual rate controllers from
entering into an oscillatory mode, constantly correcting the
allocation and ending up with a highly varying spatial
quality across a frame. However, the statistical multiplexing
gain tends to be higher as multiplexing is performed at a finer
level. Hence, the actual amount of buffering has to be chosen
carefully. The exact amount of buffering at controller 114 for
particular applications depends on their latency
requirements and the strategies used for handling pathological
cases. The channel bandwidth allocation step implemented
by controller 114 ensures that the latency requirement for
each application is met. For example, up to 10-ms latencies
can be allowed for the multiplexing delay for Class C1 and C2
encoders. Alternatively, mux buffering can be tailored based
on actual data.

The allocation decisions for all applications are made at
the slice level. After all the bits for a slice in each encoder
arrive at controller 114, the allocation is made based on the
buffer fullness and the latency requirement for the
application. This can be done in two steps: (1) each application in
Classes C1 and C2 is allocated a bandwidth that is the
minimum of the buffer occupancy and the slice-level bit rate
used by service admission manager 104, and (2) the
remaining bandwidth, if any, is then distributed among all the
applications, in turn, to meet their latency requirements.
Class C1 applications take precedence over Class C2, and
Class C2 takes precedence over Class C3. Hence, Class C3
bits are transmitted only when bits remain in an allocation
after the latency requirements for Classes C1 and C2 are met.
The buffer occupancy is maintained below the maximum
allowed buffer delay for a service during normal operation.
The exceptions (i.e., when the requirement for Classes C1
and C2 cannot be met) are handled under the pathological
cases.

Channel Bandwidth Allocation: Embodiment #2

Assume that the following profile is available for each
frame (or data unit) of the source:

(1) Lnom (Nominal Latency): This is the latency up to
which the user will not perceive any appreciable
decrease in quality; and

(2) Lmax (Maximum Latency): This is the latency above
which quality is completely unacceptable to the user.
As such, if latency exceeds this, the frame might as
well be dropped.

FIG. 2 shows the assumed piecewise linear cost function
based on latency. This is the quality measure in terms of
latency for a frame to be used in statistical multiplexing.
The costs Ca and Cb in FIG. 2 are obtained from off-line
profiling.

For the control system, the following variables are
described. Assume that the current time is Tcurr, and let the
time for encoding a frame of encoder i be Tfi.

Definitions

State of System

The state of the system is described by a set of vectors,
Pij = {Nij = number of bits in frame j of encoder i, Tij = time
spent by frame so far in physical multiplexer (PM) buffer},
where i = 0, 1, 2, . . . N, where N = number of encoders, and
j runs over frames in the PM buffer for encoder i.

Input Measurements

In the control system, the following measurement data is
received from the encoders:

Picture capture time;

Picture type;

Picture duration;

Average MQUANT used to encode the picture;

Number of bits used to encode the picture;

Advanced statistics such as macroblock variance and
other macroblock activity measures;

Whether the picture corresponds to a scene change; and

Similar information for different groups of macroblocks
within a picture.

The collection of such information over an interval
{Tcurr-M*Tfi, Tcurr} is denoted as Mij for all the frames in
that interval.

Output Measurements

Output measurements are derived from input
measurements and from the state of the system. Essentially this
measurement is the latency of a frame Lij = {Tij when the last
bit leaves the buffer} and {spatial quality measured by
average MQUANT}. The controller attempts to control and
minimize these costs.

Traffic Control and Allocation of Channel Bandwidth to
the Sources

Each encoder has Mi frames in the buffer, some of which
may be partial frames. Let bij be the bits transmitted from
each frame of each encoder i. The problem is then to allocate
bij such that Σ bij <= Bagg, while ensuring that the frame
latency is met. The following iterative procedure provides
this:

(1) Initialize bij. If Bagg is the total bits available, bij is
chosen to be proportional to Cost (time_spent_so_far).

(2) Given bij, calculate the expected frame latency aij =
Expected Value {frame_latency | bij, Pij, Mij}. This is
a modeling problem that estimates the time spent by
frame ij in the physical multiplexer, given the current
state and current measurements of the system, and the
current allocation. This is accomplished by simulating
the action of the physical MUX over the next few
time-grains (until the frame is transmitted). This
involves prediction of future values of bij, which can
use the same formula as the initialization step 1.

(3) Update bij in proportion to the expected latencies of
the sources.

(4) Repeat Steps 2 and 3 until convergence when bij is
stable, i.e., does not change by a large amount. A
formula ||Δbij|| < x*||bij|| is used, where x is nominally
10%.
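The four-step procedure above can be sketched as follows, under several simplifying assumptions that are not part of the described system: the cost is taken to be flat at a lower value up to Lnom, flat at an upper value beyond Lmax, and linear in between (the actual shape is given by FIG. 2); the expected latency in step 2 is approximated as the time already spent plus the time needed to drain the remaining bits at the current allocation, rather than by a full simulation of the physical multiplexer; and each update is damped by averaging with the previous allocation so that the iteration does not oscillate. All names are hypothetical.

```python
import math


def latency_cost(latency, l_nom, l_max, c_a, c_b):
    """Assumed piecewise linear cost: c_a up to l_nom, c_b beyond l_max."""
    if latency <= l_nom:
        return c_a
    if latency >= l_max:
        return c_b
    return c_a + (c_b - c_a) * (latency - l_nom) / (l_max - l_nom)


def allocate(frames, b_agg, grain, profile, x=0.10, max_iter=50):
    """Iteratively split b_agg bits per time grain among buffered frames.

    frames: list of (bits_left, time_spent) with bits_left > 0.
    profile: (l_nom, l_max, c_a, c_b) with c_a > 0, shared by all frames here.
    """
    def normalise(weights):
        total = sum(weights)
        return [b_agg * w / total for w in weights]

    # Step 1: initial allocation proportional to Cost(time spent so far).
    b = normalise([latency_cost(t, *profile) for _, t in frames])
    for _ in range(max_iter):
        # Step 2: expected latency under the current allocation (simplified).
        expected = [t + grain * n / bi for (n, t), bi in zip(frames, b)]
        # Step 3: move toward an allocation proportional to expected latency.
        target = normalise(expected)
        new_b = [0.5 * old + 0.5 * tgt for old, tgt in zip(b, target)]
        delta = math.sqrt(sum((nb - ob) ** 2 for nb, ob in zip(new_b, b)))
        size = math.sqrt(sum(nb ** 2 for nb in new_b))
        b = new_b
        # Step 4: stop when the change is below x times the allocation norm.
        if delta < x * size:
            break
    return b
```

In this sketch the allocations always sum to Bagg, and a frame that has already waited longer in the buffer receives a larger share than an equally sized frame that has just arrived.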

Congestion Control 

Good service admission procedures can reduce the number
of pathological cases for hardware encoders. Still,
pathological cases will happen due to the fact that profiling cannot
provide accurate slice peaks. Small deviations from the latency
requirements can be tolerated, in the hope that the rest of the
frame will not be equally hard to code. Controller 114 may
drop packets, but then processing cannot recover until the next
I frame. If picture types can be requested, then controller 114
can request an I frame from the encoder after dropping
packets. If picture type cannot be dictated, it may be
preferable to delay the frames instead of allowing packet
dropping. Then, at the next I frame, the buffer can be flushed,
thereby dropping packets right before the I frame, and
resynchronization can then be established with the I frame.

For software encoders, the tighter control explained
before will significantly reduce catastrophic breakdowns.
However, in case one occurs, controller 114 drops slices and
communicates that information back to the encoder. The
encoder can keep track of the decoder state. A good strategy
at the multiplexer is to drop the whole slice and send
a slice with all skipped macroblocks instead. If the encoder
knows this information, it can refresh these macroblocks so
that the decoder can recover. Alternatively, the encoder may
have the ability to save a previous reference frame. In that
case, when controller 114 drops a P frame or even just a slice
of a P frame, it can inform the encoder so that the encoder
will use the previous P frame for subsequent encoding,
thereby avoiding prediction errors between the encoder and
decoder.

Encoder Optimizations and Tuning For Low-Latency
Applications

The overall system latency is the sum of the latencies 
introduced by the following components: 

(1) Decoder latency: In the worst case, this is a delay of
2-frame duration, including the decoding delay and the
display delay. A higher frame rate will lead to reduced
decoder latency. For frame pictures, this delay will be
66 ms for 30 frames per second (fps). This latency can
be reduced by up to 16.5 ms by using field pictures,
instead of frame pictures.

(2) Encoder latency: The encoder is assumed to be rea- 
sonably pipelined and the delay is assumed to be about 
40% of frame delay. In that case, the delay is roughly 
15 ms for a 30-fps transmission. Further computational 
pipelining of the encoder can reduce this number. 

(3) Mux buffering at controller 114: This is a buffering
delay that can be used for rate control. It is expected to
have about 5-10 ms of buffering that can be used for
this purpose.

Of these latencies, it is assumed that only
the encoder and mux buffer latencies can be controlled.
Increased buffering latency at controller 114 is
desirable from the rate-control point of view since it gives
more time for controller 114 and the encoder to respond
to changing traffic conditions. It is assumed that the
latency at the decoder cannot be controlled, although
this knowledge can be used to design coding modes
that reduce this latency.

The latency estimates for video processing system 100
total less than 100 ms.
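The arithmetic behind this estimate can be checked with a short calculation. The sketch below simply adds the three components described above, using the upper end of the 5-10 ms mux buffering range; the function name and defaults are illustrative.

```python
def latency_budget_ms(fps=30.0, decoder_frames=2.0,
                      encoder_fraction=0.40, mux_ms=10.0):
    """Sum the decoder, encoder, and mux buffering latencies in milliseconds."""
    frame_ms = 1000.0 / fps                 # one frame interval
    decoder = decoder_frames * frame_ms     # worst-case 2-frame decoder delay
    encoder = encoder_fraction * frame_ms   # pipelined encoder, ~40% of a frame
    return {"decoder": decoder, "encoder": encoder, "mux": mux_ms,
            "total": decoder + encoder + mux_ms}
```

At 30 fps this gives roughly 67 ms for the decoder, 13 ms for the encoder, and 10 ms for the mux buffer, for a total of about 90 ms, consistent with the stated bound of less than 100 ms.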

Strategies for Reducing Latency 

As latency is reduced in specific components, greater 
ability is obtained to fine-tune the encoder and adapt to 
changing content and traffic conditions using some or all of 
the following strategies: 

Simple Profile Encoding: Since B pictures lead to
re-ordering delays, in order to maintain low latency,
encoding is performed with only I and P pictures. In
addition, using dual-prime motion vectors can result in
improved compression efficiency for IP-only encoding.

Pipelining the encoder: Computational pipelining refers 
to performing all the encoding tasks on a minimum unit 
of encoding, e.g., macroblock, slice. Typical hardware 
encoders use hierarchical motion search and cannot be 
pipelined entirely. On the other hand, in software 
encoders, the hierarchical motion estimator can be 
tailored to start a slice-level pipeline after 3 rows of 
macroblocks are available. 

Field pictures: One possibility is to perform field-picture 
encoding (even though material is progressive). The 
decoding delay will only be one field interval and this
will save one-half frame interval in decoding delay. The
encoding algorithm would have to be tailored for this 
coding mode. The fields can either be from the same 
progressive frames at 30 frames/sec in which case the 
top and bottom fields are at the same time instant, or 
they can come by sampling at 60 frames/sec and 
throwing away alternate fields. The latter solution may 
better match the interlaced display in the home. In both 
cases, special preprocessing may then become neces- 
sary. The algorithms can be tailored to enable good 
quality while using this field-picture mode. 

Algorithmic Improvements for Game/Web Content 
Encoding 

In addition to the above-mentioned low-latency
improvements, a number of other possible improvements
can be implemented to improve the coding performance, as 
well as reduce latencies for graphics and web content. 

Pre-Encoding of Static Portions of Web/Email 
Browsers 

If browser signals were intercepted, it would be possible 
to pre-encode the various options and pop-up menus. This 
can lead to better I-frame coding of the static portions and 
so will require fewer bits subsequently. The constancy in the 
quality of the browser menus and icons will improve the 
perceptual quality considerably. The encoding latency will 
be reduced, though this is not a major issue in these 
applications. However, the savings in cycles could be
significant enough to allow more web/email users to be
admitted at the same time.

Region-of-Interest Encoding

Many games have specific regions of interest that are of
more importance to the player. For example, most games
have a center-weighted region of attention. This can be
exploited in the bit-allocation strategy within a frame.
Furthermore, it can also be used for intelligent
packet-dropping at controller 114 when buffer or latency
requirements are not met.
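A minimal sketch of such priority-based dropping is given below. It assumes that the encoder tags each group of macroblocks with a numeric priority (higher meaning more important to the player) and that the controller keeps groups greedily in priority order until the bit budget is exhausted; the tuple layout and function name are hypothetical.

```python
def drop_low_priority(groups, bit_budget):
    """Keep the most important macroblock groups that fit in bit_budget.

    groups: list of (group_id, priority, bits); higher priority is kept first.
    Returns (kept_ids, dropped_ids).
    """
    kept, dropped, used = [], [], 0
    # Visit groups from most to least important.
    for group_id, _priority, bits in sorted(groups, key=lambda g: -g[1]):
        if used + bits <= bit_budget:
            kept.append(group_id)
            used += bits
        else:
            dropped.append(group_id)
    return kept, dropped
```

With a center-weighted priority assignment, the screen edges are the first regions to be dropped under congestion, while the center of attention is preserved.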

Encoder Parameter Tuning 

The following encoding parameters can be tuned to 
improve the compression efficiency for game/web content. 
Note that hardware encoders are usually tuned for natural 
video scenes and hence might not perform as well on 
graphics and text content. 

(a) Rate control initialization: A careful initialization of 
the rate control to match the multiplexer operation as 
well as the GOP structure can provide substantial 
improvements in quality. 

(b) Quantizer matrix selection: The quantizer matrices 
commonly used are tailored to natural video. Matrices 
can be developed that are tailored to graphics and text. 

(c) Perceptually adaptive quantization: In MPEG-2 
encoding, the complexity or activity of a block is used 
for perceptually adaptive quantization. These compu- 
tations should be modified for graphics and text 
content, and different measures of activity and distor- 
tion should be used. 

(d) Pre-processing: The final output display device is
interlaced, even though the encoded material is
progressive. Further, field picture coding modes are
proposed to reduce latency. Thus, suitable pre-processing
by vertical filtering, etc. is essential for good display
quality.

(e) Low-latency scene change detection: If scene changes 
are quickly detected, controller 114 can be provided 
with this information to allow it to respond by changing 
the allocations for various applications and perhaps 
postponing intra frames on other channels whenever 
possible. 

(f) Encoding complexity estimation: Rate-distortion mod- 
els enable prediction of encoding complexities for a 
frame based on distortion parameters. These models 
will be useful for the advance allocation statistical 
multiplexer. However, the models have mostly been 
developed for natural video and need modifications for 
game and web content. 

Distributed Intra-Refresh Strategies 

Much of the application bit-rate fluctuation comes
from changes in picture types, with I frames typically using
more bits than P frames. This fluctuation can be reduced by
distributing the intra-coding of macroblocks over a number 
of P frames. In the absence of scene changes, this strategy 
can yield a relatively smooth bit-rate profile. This choice can 
easily be implemented on software encoders, but not on 
hardware encoders. 

Motion Estimation Complexity Reduction 

In text browsing applications, motion is typically very
even and translational across a region of the image. This
assumption can be used to reduce the complexity of motion
estimation. For example, within a row, motion estimation
could be performed on a subset of the macroblocks and if the
motion is determined to be similar, the same motion vector
can be used for the other macroblocks.
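The row-level shortcut described above can be sketched as follows. The sketch assumes a caller-supplied full-search routine, samples every fourth macroblock in the row, and treats motion as "similar" when every sampled vector is within one pel of the first; the sampling step, the tolerance, and all names are illustrative assumptions.

```python
def row_motion_vectors(row_len, estimate, step=4, tolerance=1):
    """Estimate motion vectors for one macroblock row with few full searches.

    estimate(i) -> (dx, dy) performs a full search for macroblock i.
    If the sampled vectors agree within tolerance, the first sampled vector
    is reused for the unsampled macroblocks; otherwise every remaining
    macroblock is searched individually.
    """
    sampled = {i: estimate(i) for i in range(0, row_len, step)}
    ref = sampled[0]
    uniform = all(abs(dx - ref[0]) <= tolerance and abs(dy - ref[1]) <= tolerance
                  for dx, dy in sampled.values())
    if uniform:
        # Translational motion: reuse the reference vector between samples.
        return [sampled.get(i, ref) for i in range(row_len)]
    # Motion differs along the row: fall back to per-macroblock search.
    return [sampled[i] if i in sampled else estimate(i) for i in range(row_len)]
```

For a 16-macroblock row that scrolls uniformly, only 4 full searches are performed instead of 16.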

Motion estimation complexity can also be reduced by
exploiting the knowledge about the graphics commands.
Intercepted graphics commands can be used to quickly and
accurately estimate motion without going through the
complete search process. Again, this may lead to significant
computational savings.

Dynamic Frame Rate Selection and Spatial
Resolution Change

The frame rate can be dynamically adjusted based on the
content and the state of controller 114. In cases where the
channel is overloaded, frame rates could be reduced to
maintain acceptable spatial quality. Note that this solution
will mainly work for intermediate- to low-interactivity
applications. Another innovation would involve dynamic
changes in spatial resolution (to half-horizontal, for
example), whenever the content is less detailed or whenever
channel constraints so dictate. In MPEG-2 encoding, this is
done at the GOP level, rather than at the picture level.
However, this is a better response to channel congestion than
the catastrophic case handling described in the previous
section.

Dynamic GOP Structure 

The GOP structure can be limited to a relatively simple 
structure consisting of an I frame followed by a number of 
consecutive P frames. The frequency of I frames can be 
dynamically adjusted by controller 114 across the encoders 
in order to stagger the I frames to take advantage of 
statistical multiplexing gains. In many cases, due to scene
changes, an encoder might start I-frame encoding at
instances when it was not scheduled. In those cases,
controller 114 should delay and reschedule I frames for the other
encoders in order to maintain QoS across different
applications.
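The staggering and rescheduling of I frames can be sketched as follows, assuming for illustration that I frames are scheduled on common frame slots, that at most one encoder should produce an I frame per slot, and that a conflicting I frame is pushed back to the next free slot. The function names and the dictionary layout are hypothetical.

```python
def stagger_offsets(num_encoders, gop_len):
    """Spread the I-frame slots of the encoders evenly over one GOP."""
    return [(i * gop_len) // num_encoders for i in range(num_encoders)]


def delay_conflicts(scheduled, unscheduled_encoder, slot):
    """Reschedule I frames after an encoder starts an unscheduled I frame.

    scheduled: {encoder_id: slot of its next scheduled I frame}.
    Returns a new schedule in which no two encoders share a slot.
    """
    updated = dict(scheduled)
    updated[unscheduled_encoder] = slot      # the scene change wins its slot
    taken = {slot}
    for enc in sorted(scheduled):
        if enc == unscheduled_encoder:
            continue
        s = scheduled[enc]
        while s in taken:                    # delay until a free slot is found
            s += 1
        updated[enc] = s
        taken.add(s)
    return updated
```

For example, with four encoders and a 12-frame GOP the I frames are initially placed at slots 0, 3, 6, and 9; if encoder 2 hits a scene change at slot 3, the I frame of encoder 1 is delayed to slot 4.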

Miscellaneous Features 

In addition, depending on the implementation, controller
114 may be able to perform one or more of the following
miscellaneous features:

Scheduling I frames based on advance knowledge
acquired from the application. In general, controller
114 uses advance knowledge from a video application
to control the encoding process for that application.
For example, an application like a web browser
can anticipate a scene change when a user clicks a new
page, and inform controller 114. Controller 114 can
anticipate a large bit rate for the frame and use it to
control the compression processing of the other video
applications as well as this particular application. For
example, any scheduled I frames of the other
applications can be switched to adaptive refresh mode.

Use of adaptive intra-refresh for handling scene changes.
This can include the use of intra-macroblocks in the
region of interest as a means of control when a scene
change has occurred.

In case controller 114 cannot match the latency
requirement for a particular video application, it sends a signal
back to the application delaying the application. Thus,
the application knows that the user has not been given
a chance to respond and thus pauses. This is useful in
high-interactivity services like video games. This delay
can be achieved by using the pause command available
on many applications.

Use of region-of-interest (ROI) information by controller
114. One way is for the encoder to send priority
information on groups of macroblocks. Controller 114
then drops the low-priority regions in case of
congestion. In addition or alternatively, controller 114 uses
pre-encoded portions of the bitstream and does some
bit-stream manipulation. This can be used in
web browsing and for backgrounds of games. In particular,
the pre-encoded portions will be used for sections
outside the ROIs, as a special method for handling
ROI-based control.

Summary 

The proposed statistical multiplexer tools offer the fol- 
lowing advantages over other off-the-shelf multiplexers: 

1. Exploiting the varying QoS requirements to improve 
channel utilization while providing an acceptable qual- 
ity for all applications; 

2. Reacting to the less controllable encoders by exercising 
rate control measures on the more controllable software 
encoders; 

3. Taking advantage of the knowledge about the software 
encoder to improve perceptual quality; 

4. Achieving low latency through advance allocation of 
bit budget and through proper buffer management at the 
multiplexer; 

5. Making frame-level bit allocation proportional to
content complexity; and

6. Performing graceful degradation of quality during 
congestion through better understanding of the effect of 
packet dropping from profiling and by effectively com- 
municating with the controllable encoders. 

Channel Surfing 

In some cases, a user may decide to keep his initial 
application running on one channel while surfing other 
channels in order to return to the initial application. Or, he 
may run two sessions in parallel and switch between ses- 
sions. These cases should be handled effectively, including 
taking advantage of these situations to reduce transmission 
bandwidth. For example, after detecting that the user has 
moved to another channel (e.g., based on monitoring the 
return path and the content served), a low-bit-rate slide show 
(e.g., I frames spaced relatively far apart) can be sent for 
decoder resynchronization when the user comes back to the
original interactive application. If the slide show lasts longer 
than a certain timeout period, the user's session can be 
automatically terminated. An alternative can be to save the 
game for later resumption. 

Possible System Architecture 

Low-delay MPEG2 video/audio encoding and statistical 
multiplexing are key technical requirements for many Digi- 
tal Television (DTV) and digital cable TV applications. In a 
conventional low-cost PCI (Peripheral Component 
Interconnect) bus-based computer system, significant pro- 
cessing delays are contributed by the system control, pro- 
gram layer PES (Packetized Elementary Stream) and trans- 
port TS (Transport Stream) multiplexing, and the PCI bus. In 
particular, the PCI bus will introduce uncertain delays
based on the PCI-BIOS (PCI Basic Input/Output System) and
the Windows™ operating system from Microsoft
Corporation of Redmond, Wash.

Computer systems in accordance with the present
invention avoid PCI bus delay by using the built-in multi-channel
Synchronized Serial Interface (SSI) ports of multiple Digital
Signal Processors (DSPs), where each DSP performs video
and audio encoder control, PES/TS layer multiplexing, and
computation of statistical measurements of its
corresponding video stream payload. The DSPs' on-chip memories may
also eliminate the need for bitstream First-In, First-Out
(FIFO) chips and some common SDRAM (Synchronous
Dynamic Random Access Memory) chips.

FIG. 3 shows a system-level block diagram of computer
system 300, according to one embodiment of the present
invention. Computer system 300 is a PCI bus-based
industrial PC (Personal Computer) enclosure with multiple PCI
boards. In particular, computer system 300 comprises a PCI
bus 302 configured with a Central Processing Unit (CPU)
board 304, up to n=24 encoder boards 306, and a statistical
multiplexing (stat-mux) board 308. Although computer
system 300 relies on a PCI bus, it will be understood that any
other suitable system bus could be used in alternative
embodiments of the present invention.

CPU board 304 is a conventional industrial PC
motherboard having a suitable central processor, such as an Intel
Pentium III™ microprocessor by Intel Corporation of Santa
Clara, Calif. In addition, CPU board 304 has a conventional
PCI interface 310, an ISA (Industry Standard Architecture)
bus interface 312, RS232 ports 314, a (e.g., 100-MHz) Local
Area Network (LAN) interface 316, a hard disk/floppy disk
(HD/FD) controller, and other standard PC peripheral
interfaces. Software (e.g., in the "C" programming language)
implemented by the Pentium processor may provide main
system controls, fault-tolerant controls, and/or statistical
multiplexing of those bitstreams that do not have
low-latency requirements.

Each encoder board 306 is an integrated video/audio
encoder with an SDI (Serial Digital Interface or Serial DI)
or ASI (Asynchronous Serial Interface) input port 318, a
video encoder 320, an audio encoder 322, a PCI bus
interface 324, and a DSP controller 326 (with an SSI port 328)
for board-level sub-system control and low-delay PES/TS
multiplexing plus bitstream statistics parameter
measurement.

Stat-mux board 308 has a PCI bus interface 330 and four
DSP chips 332, where each DSP chip 332 has a six-channel
SSI DMA (Direct Memory Access) 334 with six SSI ports
336, SRAMs 338, two DSP cores 340, and an ASI/TAXI™
chip set from Advanced Micro Devices, Inc., of Sunnyvale,
Calif., and, in block 342, a DHEI (Digital High-speed
Expansion Interface) I/O port from General Instrument
Corporation (GI) of Horsham, Pa., for GI's modulator and
CA (Conditional Access) equipment. As such, stat-mux
board 308 can support up to 24 channels of low-delay
MPEG2 video/audio input bitstreams.

PCI bus 302 is used for power supply and system control
for each PCI board. A DSP chip on each encoder board 306
will directly transfer low-delay MPEG2 bitstreams to a
corresponding DSP on stat-mux board 308. In particular,
each low-delay MPEG2 video/audio bitstream will be
directly transmitted from the SSI port 328 of the
corresponding encoder board 306 to an SSI port 336 on
stat-mux board 308. The associated delay can be controlled
to correspond to as few as four transport packet delays,
with a two-packet delay in the encoder DSP 326, a
one-packet delay at an input port 336 of stat-mux board
308, and a one-packet delay at an output port 342 of
stat-mux board 308. In addition, PCI bus 302 can be used to
transmit additional MPEG2 video/audio bitstreams that do
not have low-latency requirements. Depending on the
implementation, these high-latency bitstreams may be
generated by video/audio encoders implemented in software
within the central processor on CPU board 304.
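
As a rough illustration of the four-packet delay budget described above, the following sketch converts a count of transport-packet delays into time. It assumes the standard 188-byte MPEG2 transport packet and, as a simplification, a single bit rate for every stage. The function name and the 4-Mbps example rate are illustrative assumptions and do not come from this specification.

```python
TS_PACKET_BYTES = 188  # standard MPEG2 transport packet size

# Packet delays per stage, as enumerated in the paragraph above.
STAGE_PACKET_DELAYS = {
    "encoder DSP 326": 2,
    "stat-mux input port 336": 1,
    "stat-mux output port 342": 1,
}


def packet_delay_seconds(num_packets, bit_rate_bps):
    """Time needed to move num_packets transport packets at bit_rate_bps."""
    if bit_rate_bps <= 0:
        raise ValueError("bit rate must be positive")
    return num_packets * TS_PACKET_BYTES * 8 / bit_rate_bps


total_packets = sum(STAGE_PACKET_DELAYS.values())
# Hypothetical 4-Mbps stream, chosen only for illustration.
delay_ms = packet_delay_seconds(total_packets, 4_000_000) * 1000
print(f"{total_packets} packet delays at 4 Mbps: {delay_ms:.3f} ms")
```

In practice the output port would run at the channel rate rather than the stream rate, so the per-stage terms would be computed separately.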

FIG. 4 shows a board-level block diagram of each encoder 
board 306 of computer system 300 of FIG. 3, according to 
one embodiment of the present invention. Encoder board 
306 comprises an internal board bus 402 configured with an 
input interface module 318, an MPEG2 video encoder 320,
an AC3 or MP3 audio encoder module 322, a DSP controller
326 with PES/TS-layer multiplexing firmware, and 27-MHz
SCR/PCR circuits 408, where SCR is the System Clock 
Reference in an MPEG video decoder and PCR is the 
Program Clock Reference in an MPEG transport decoder. 

Input interface module 318 can support both SDI and ASI
circuits with a 270-MHz or 180-MHz line-coded clock,
respectively. The SDI or ASI signals can be customized to
interleave the uncompressed digital video data and
multi-channel audio data. CPLD (Complex Programmable Logic
Device) or FPGA (Field-Programmable Gate Array) based
deframing firmware splits the video and audio data and
reproduces the video synchronization signals for the MPEG2
video encoder chip.

MPEG2 video encoder 320 can be any suitable single- 
chip encoder, such as those supplied by IBM, C-Cube, or 
Philips, with supporting SDRAM, SRAM, and/or flash 
memories 404 and necessary glue logic circuits. The glue 
logic can be combined within the input CPLD firmware. 
Downloadable microcode is also available from the
MPEG2 chip manufacturer.

Audio encoder 322 can be any suitable off-the-shelf
DSP-based sub-system that can support either the AC3 or MP3
encoding function depending on the DSP software. If a
TMS320c5410 DSP chip from Texas Instruments Incorporated
of Dallas, Tex., is used, then the audio encoding
functions of audio encoder 322 can be combined with DSP
controller 326, shared memories 406, and the PES/TS
multiplexing firmware for less board area and lower
integration costs.

Alternatively, DSP 326 may be a TMS320c5402 DSP 
from Texas Instruments. DSP 326 will provide video
encoder control, audio encoder control, the SCR/PCR time- 
base controls, and the overall board-level controls. It will 
also perform the PES/TS multiplexing of compressed video 
and audio bitstreams, and the statistical parameter measure- 
ments of the video stream. It will also execute the commands 
of statistical multiplexing controls received from PCI bus 
302 of FIG. 3. 

DSP on-chip SSI output port 328 can be directly con- 
nected to an SSI input port of a DSP on stat-mux board 308 
of FIG. 3. The on-chip DMA will automatically move data 
from the TS output buffer of on-chip memory to the serial 
output port. The TMS320c5410 DSP has 128 Kbytes of 
on-chip memory and a DMA-controlled host interface port, 
such that external SRAM and FIFO devices may be elimi- 
nated. For example, when video encoder 320 is an IBM39 
MPEGS422 video encoder chip, the video encoder can 
directly write its compressed video data into the 
TMS320c5410 on-chip SRAM with a simple CPLD to 
emulate the FIFO signals. The PES/TS MUX delay can be
kept within the time needed to transmit two TS packets of
the video stream, i.e., a delay of 2×188×8/video_rate.

DSP on-chip timer 408 can also be programmed for the
27-MHz SCR/PCR time-base by incorporating on-chip PLL
(Phase-Locked Loop) circuits. All of the 27-MHz clocks
will be derived from the same 27-MHz clock on stat-mux
board 308 through the clocks of the SSI ports connected to
all of the encoder boards 306.

FIG. 5 shows a board-level block diagram of statistical
multiplexing board 308 of computer system 300 of FIG. 3, 
according to one embodiment of the present invention. 
Stat-mux board 308 is a low-delay Input/Output (I/O) inter- 
face PCI board with the statistical multiplexing system and 
PCR time-base correction firmware. Stat-mux board 308 
comprises an internal sub-system bus 502 configured with 
four Texas Instruments TMS320c5420 DSP chips 332, each 
having six SSI serial ports 336 and 512 Kbytes of on-chip 
SRAM memory 338, such that stat-mux board 308 can 
receive up to 24 different channels of transport bitstreams. 

Each SSI serial input port 336 has three wires carrying a 
clock signal (sclk), a data signal (sdat), and a frame signal. 
All 24 clock signals sclk should be configured as the input 
clock signals and connected to an on-board 27-MHz clock 
oscillator 504. 27-MHz clock 504 will also be used as the 
DSP clock, and on-chip PLL circuits will generate a 90-MHz 
DSP clock. In that case, on-chip timers can be used for the 
PCR time-base corrections. The frame signals will indicate 
whether or not the data signal sdat carries meaningful data. 
The data signals sdat are sent in bursts with a maximum
rate of 27 Mbps. The frame signals can also be programmed in a
"multi-channel mode" to send multiple packets into assigned
on-chip buffers for transmitting the individual encoders'
statistical parameters.
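
The role of the frame signal as a data-valid qualifier can be sketched in software as follows. This is only an illustrative model, not the DSP serial-port hardware: it assumes one sdat sample and one frame sample per sclk cycle and MSB-first bit ordering, and the function name is hypothetical.

```python
def deserialize_ssi(sdat, frame):
    """Pack the sdat bits sampled while frame is asserted into bytes.

    sdat and frame are equal-length sequences of 0/1 samples, one per sclk
    cycle. MSB-first bit order is an assumption made for this sketch.
    """
    if len(sdat) != len(frame):
        raise ValueError("sdat and frame must have one sample per clock")
    # Keep only the bits that the frame signal marks as meaningful.
    bits = [d & 1 for d, f in zip(sdat, frame) if f]
    usable = len(bits) - len(bits) % 8  # ignore a trailing partial byte
    out = bytearray()
    for i in range(0, usable, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


# Idle samples (frame low) surround one valid byte, 0x47, the MPEG2 TS sync byte.
sdat = [1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0]
frame = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0]
print(deserialize_ssi(sdat, frame).hex())
```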

ASI interface 506 uses a TAXI transmitter chip with a
parallel interface from Advanced Micro Devices, with
FIFO and CPLD control circuits to handle the
TAXI interface and ASI controls. A DHEI interface 508
from GI will need additional PLL circuits to generate the
output clock if no input clock signal is available from
DHEI port 510. DHEI line-driver chips provide
the proper bi-level output interface.

Although the present invention has been described in the 
context of a computer system in which each of the central 
processing sub-system, the statistical multiplexing sub- 
system, and each encoding sub-system is implemented on a 
separate computer board of the computer system, the present 
invention is not so limited. In particular, two or more of the 
different sub-systems could be implemented on a single
board. Alternatively or in addition, any of the sub-systems 
could be implemented on more than one board. The impor- 
tant characteristics of the present invention relate to how the 
various components of the different sub-systems communi- 
cate with one another, rather than where those components 
are physically located. 

Although the present invention has been described in the 
context of a system having a central processing sub-system, 
in addition to the statistical multiplexing sub-system and 
multiple encoding sub-systems, all of which are configured 
to a PCI bus, it will be understood that the present invention 
is not so limited. In particular, the present invention can also 
be implemented in computer systems in which there is no 
separate central processing sub-system, but where all of the 
centralized control functions are implemented in the DSPs 
of the statistical multiplexing sub-system. Moreover, such a 
computer system may be implemented with or without a 
system bus, such as a PCI bus. 

The present invention can be embodied in the form of 
methods and apparatuses for practicing those methods. The 
present invention can also be embodied in the form of 
program code embodied in tangible media, such as floppy 
diskettes, CD-ROMs, hard drives, or any other machine- 
readable storage medium, wherein, when the program code 
is loaded into and executed by a machine, such as a 
computer, the machine becomes an apparatus for practicing
the invention. The present invention can also be embodied in
the form of program code, for example, whether stored in a
storage medium, loaded into and/or executed by a machine,
or transmitted over some transmission medium or carrier,
such as over electrical wiring or cabling, through fiber
optics, or via electromagnetic radiation, wherein, when the
program code is loaded into and executed by a machine,
such as a computer, the machine becomes an apparatus for
practicing the invention. When implemented on a
general-purpose processor, the program code segments combine
with the processor to provide a unique device that operates
analogously to specific logic circuits.

It will be further understood that various changes in the 
details, materials, and arrangements of the parts which have 
been described and illustrated in order to explain the nature 
of this invention may be made by those skilled in the art
without departing from the principle and scope of the 
invention as expressed in the following claims. 

What is claimed is: 

1. A method for controlling transmission over a shared
communication channel of multiple compressed video
bitstreams generated by a plurality of video encoders and
corresponding to multiple video applications, comprising
the steps of:

(a) receiving information for each compressed video
bitstream wherein at least two of the video applications
have different latency requirements;

(b) controlling the transmission of data from the multiple
compressed video bitstreams over the shared
communication channel taking into account the information for
each compressed video bitstream and the latency
requirement of each corresponding video application;
and

(c) adaptively controlling compression processing of at
least one of the video encoders taking into account the
information for the corresponding compressed video
bitstream and the latency requirement of the
corresponding video application.

2. The invention of claim 1, wherein at least two of the 
video applications have different bandwidth requirements
and steps (b) and (c) are both implemented taking into 
account the bandwidth requirement of one or more of the 
video applications. 

3. The invention of claim 1, further comprising the step of 
(d) controlling admission of a new video application for
transmission of a corresponding compressed video bitstream 
over the shared communication channel. 

4. The invention of claim 3, wherein step (d) comprises 
the steps of: 

(1) receiving a classification for the new video
application;

(2) accessing results from off-line profiling of typical 
video streams corresponding to the classification for the 
new video application; and 

(3) determining whether to admit the new video
application based on the off-line profiling results.

5. The invention of claim 4, wherein step (d) further 
comprises the step of (4) assigning the new video applica- 
tion to an appropriate one of a set of available video 
encoders, wherein at least two of the available video
encoders have different video compression capabilities.

6. The invention of claim 5, wherein the different video 
compression capabilities include different levels of external 
control over video compression processing in step (c). 

7. The invention of claim 5, wherein the different video
compression capabilities include different levels of video
compression processing power.

8. The invention of claim 4, wherein:

each video application is categorized as being either:

a C1 application having relatively high bandwidth and
relatively low latency requirements;

a C2 application having relatively intermediate
bandwidth and relatively intermediate latency
requirements; or

a C3 application having relatively high latency
requirements; and

step (d)(3) comprises the step of admitting the new
application if and only if both of the following two
rules would be satisfied after admitting the new
video application:

(i) a sum of peak bandwidths for all C1 applications + a
sum of the average bandwidths for all C2
applications is less than a total bandwidth of the shared
communication channel;

(ii) a sum of average bit rates of all applications is less
than the total bandwidth of the shared
communication channel; and

(iii) encoding resources are available for the new
application.
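
By way of illustration only, the admission test recited in claim 8 might be sketched as follows, with all three listed conditions (i) through (iii) checked. The `App` record, the `admit` function, the use of a free-encoder count to stand in for "encoding resources," and all bit rates are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class App:
    category: str    # "C1", "C2", or "C3"
    peak_bps: float  # profiled peak bandwidth
    avg_bps: float   # profiled average bandwidth


def admit(existing, new_app, channel_bps, free_encoders):
    """Return True if the rules would hold after admitting new_app."""
    apps = list(existing) + [new_app]
    # Rule (i): C1 peaks plus C2 averages must fit in the channel.
    rule_i = (sum(a.peak_bps for a in apps if a.category == "C1")
              + sum(a.avg_bps for a in apps if a.category == "C2")) < channel_bps
    # Rule (ii): the average bit rates of all applications must fit.
    rule_ii = sum(a.avg_bps for a in apps) < channel_bps
    # Rule (iii): an encoder must be available (modeled here as a simple count).
    rule_iii = free_encoders > 0
    return rule_i and rule_ii and rule_iii


running = [App("C1", 6e6, 3e6), App("C2", 4e6, 2e6)]
print(admit(running, App("C3", 2e6, 1e6), channel_bps=10e6, free_encoders=1))
print(admit(running, App("C1", 5e6, 2e6), channel_bps=10e6, free_encoders=1))
```

In this hypothetical 10-Mbps example, the C3 application is admitted, while the second C1 application is rejected because the C1 peaks alone would exceed the channel.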

9. The invention of claim 4, wherein:

each video application is categorized as being either:

a C1 application having relatively high bandwidth and
relatively low latency requirements;

a C2 application having relatively intermediate
bandwidth and relatively intermediate latency
requirements; or

a C3 application having relatively high latency
requirements; and

step (d)(3) comprises the step of admitting the new
application if and only if both of the following two
rules would be satisfied after admitting the new
video application:

(i) maximum of a sum of peak bandwidths of
concurrent I frames possible at a time based on GOP
structures for C1 applications + a sum of average
bandwidths of all C2 and C3 applications is less
than a total bandwidth of the shared
communication channel;

(ii) a sum of average bit rates of all applications is
less than the total bandwidth of the shared
communication channel; and

(iii) encoding resources are available for the new
application.
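
Rule (i) of claim 9 depends on which I frames can coincide. A minimal sketch of one way to evaluate it is given below, assuming that each C1 application has a fixed, periodic GOP with a known I-frame offset measured in frame times. That model, the tuple layout, and the function names are assumptions for illustration and are not taken from this specification.

```python
from functools import reduce
from math import gcd


def max_concurrent_i_peak(c1_apps):
    """c1_apps: list of (gop_length, i_frame_offset, i_frame_peak_bps).

    Returns the largest sum of I-frame peak bandwidths that can fall in the
    same frame time, searched over one common period of all GOP lengths.
    """
    if not c1_apps:
        return 0.0
    # Least common multiple of the GOP lengths gives the repeat period.
    period = reduce(lambda a, b: a * b // gcd(a, b), (n for n, _, _ in c1_apps))
    return max(
        sum(peak for n, offset, peak in c1_apps if (t - offset) % n == 0)
        for t in range(period)
    )


def rule_i(c1_apps, c2_c3_avg_bps, channel_bps):
    """Concurrent I-frame peaks plus C2/C3 averages must fit in the channel."""
    return max_concurrent_i_peak(c1_apps) + sum(c2_c3_avg_bps) < channel_bps


aligned = [(12, 0, 8e6), (12, 0, 8e6)]    # I frames always coincide
staggered = [(12, 0, 8e6), (12, 6, 8e6)]  # I frames never coincide
print(max_concurrent_i_peak(aligned), max_concurrent_i_peak(staggered))
```

Under these assumptions, staggering the two GOPs halves the worst-case I-frame term, which is why the same pair of applications can pass rule (i) on a narrower channel.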

10. The invention of claim 1, wherein at least two of the
video encoders have different video compression
capabilities.

11. The invention of claim 10, wherein the different video
compression capabilities include different levels of external 
control over video compression processing in step (c). 

12. The invention of claim 10, wherein the different video 
compression capabilities include different levels of video 
compression processing power. 

13. The invention of claim 1, wherein the processing of at 
least two of the video encoders is staggered within a frame 
time and step (c) comprises the step of controlling the process
of encoding at least one compressed video bitstream taking 
into account the information for at least one other com- 
pressed video bitstream earlier in the same frame time. 

14. The invention of claim 1, further comprising the step 
of (d) performing off-line profiling of typical video streams 
corresponding to different classifications of video applica- 
tions to generate profiling results for use during at least one 
of steps (b) and (c). 

15. The invention of claim 14, wherein step (d) comprises 
the step of characterizing a level of interactivity for each 
typical video stream.

16. The invention of claim 14, wherein step (d) comprises
the step of characterizing a desired level of video compres- 
sion processing power for each typical video stream. 

17. The invention of claim 16, wherein step (d) further 
comprises the steps of (1) identifying a class of video 
encoders for each typical video stream based on the desired 
level of video compression processing power and (2) char- 
acterizing a level of external control provided by the iden- 
tified class of video encoders. 

18. The invention of claim 14, wherein the profiling 
results are used during step (b) to determine an acceptable 
level of buffering for at least one video application. 

19. The invention of claim 14, wherein the profiling 
results are used during step (b) to order packets of data from 
different video applications. 

20. The invention of claim 1, wherein step (b) comprises 
the step of dropping data from a compressed video bit- 
stream. 

21. The invention of claim 20, wherein step (c) comprises 
the step of instructing a corresponding video encoder to take 
into account the data dropping for subsequent compression 
processing. 

22. The invention of claim 21, wherein the corresponding 
video encoder retains a previous reference frame to take into 
account the data dropping during the subsequent compres- 
sion processing. 

23. The invention of claim 20, wherein step (b) further 
comprises the step of inserting skip codes into the com- 
pressed video bitstream in place of the dropped data. 

24. The invention of claim 1, wherein step (b) comprises 
the steps of: 

(1) delaying transmission of one or more frames from a 
compressed video bitstream during periods of high 
channel bandwidth usage; and 

(2) dropping one or more P frames before the next I frame 
to re-acquire a desirable latency level. 
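
The delay-then-drop behavior of claim 24 might be sketched as follows. The sketch assumes that queued frames are described by a type and a size in bits, that latency is approximated by the time needed to drain the frames queued ahead of the next I frame, and that P frames are dropped from the end of the current GOP backwards so that no retained frame references a dropped one. These modeling choices and the function name are assumptions, not requirements of the claim.

```python
def reacquire_latency(queue, channel_bps, target_latency_s):
    """Drop trailing P frames before the next I frame until latency is met.

    queue: list of (frame_type, size_bits) tuples in transmission order.
    Returns (new_queue, number_of_frames_dropped).
    """
    # Index of the next I frame after the head of the queue, if any.
    next_i = next(
        (i for i, (kind, _) in enumerate(queue) if i > 0 and kind == "I"),
        len(queue),
    )
    head, tail = list(queue[:next_i]), list(queue[next_i:])
    dropped = 0
    # Drop from the GOP tail while the backlog ahead of the next I frame
    # would take longer than the target latency to drain.
    while (head and head[-1][0] == "P"
           and sum(bits for _, bits in head) / channel_bps > target_latency_s):
        head.pop()
        dropped += 1
    return head + tail, dropped


backlog = [("I", 400_000), ("P", 100_000), ("P", 100_000),
           ("P", 100_000), ("I", 400_000)]
kept, dropped = reacquire_latency(backlog, 10_000_000, 0.055)
print(dropped, kept)
```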

25. The invention of claim 1, wherein step (c) comprises 
the step of encoding one or more frames on demand for a 
video application with a relatively high latency requirement. 

26. The invention of claim 1, wherein step (b) comprises 
the step of scheduling transmission of compressed data 
corresponding to a video application having a relatively high 
latency requirement to coincide with relatively low-bit-rate 
periods of one or more other video applications having a 
relatively low latency requirement. 
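
One simple way to picture the scheduling of claim 26 is to fill whatever channel capacity the low-latency applications leave unused in each frame time with queued data from a high-latency application. The sketch below assumes a fixed capacity per frame time and a known per-frame-time demand from the low-latency applications. The function name and all numbers are illustrative assumptions.

```python
def fill_low_rate_periods(low_latency_bits, capacity_bits, backlog_bits):
    """Schedule high-latency data into capacity left by low-latency traffic.

    low_latency_bits: bits needed by low-latency applications per frame time.
    capacity_bits: channel capacity per frame time.
    backlog_bits: high-latency data waiting to be sent.
    Returns (bits_sent_per_frame_time, remaining_backlog_bits).
    """
    schedule = []
    for used in low_latency_bits:
        spare = max(0, capacity_bits - used)  # room left in this frame time
        sent = min(spare, backlog_bits)
        backlog_bits -= sent
        schedule.append(sent)
    return schedule, backlog_bits


# The low-latency demand dips in the second frame time, so most of the
# high-latency backlog is sent there.
schedule, remaining = fill_low_rate_periods([900, 200, 500], 1000, 1000)
print(schedule, remaining)
```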

27. The invention of claim 26, wherein step (c) comprises 
the step of encoding frames on demand for a video appli- 
cation having a relatively high latency requirement to 
achieve the scheduling of step (b). 

28. The invention of claim 27, wherein step (c) further 
comprises the step of requesting frame types for one or more 
video applications. 

29. The invention of claim 1, wherein step (c) comprises 
the step of staggering I frames between different video 
applications over different frame times. 

30. The invention of claim 1, wherein step (c) comprises 
the step of instructing compression of a video application to 
include an adaptive refresh strategy in which intra slices are 
spread over multiple frames to reduce frequency of bit-rate 
peaks associated with I frames. 
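
As an illustration of spreading intra slices over multiple frames, the sketch below uses a simple round-robin assignment in which slice s is intra-coded in every frame whose index is congruent to s modulo the refresh period. The round-robin rule, the function name, and the 30-slice example are assumptions. The claim does not prescribe a particular assignment.

```python
def intra_slices(frame_index, num_slices, refresh_period):
    """Slices to intra-code in this frame under a round-robin refresh."""
    if refresh_period <= 0:
        raise ValueError("refresh period must be positive")
    phase = frame_index % refresh_period
    return [s for s in range(num_slices) if s % refresh_period == phase]


# Hypothetical picture with 30 slices refreshed over 15 frames: two intra
# slices per frame instead of 30 in a single I frame.
for k in range(3):
    print(k, intra_slices(k, num_slices=30, refresh_period=15))
```

Every slice is still refreshed once per refresh period, but the intra-coded bits are spread across the period rather than concentrated in one frame.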

31. The invention of claim 1, wherein the latency require- 
ment for a video application can vary over time and step (c) 
takes the varying latency requirement into account. 

32. The invention of claim 1, wherein step (c) comprises 
the step of performing advance bit allocation for one or more 
of the video applications. 

33. The invention of claim 1, wherein step (c) comprises 
the step of changing spatial resolution of a subsequent frame 
for compression processing of at least one of the video 
applications.

34. The invention of claim 1, wherein step (c) comprises
the step of changing frame rate for compression processing 
of at least one of the video applications. 

35. The invention of claim 1, wherein step (c) comprises 
the step of controlling compression processing of at least 
one of the video applications based on advance information 
acquired from the video application. 

36. The invention of claim 1, further comprising the step 
of instructing at least one of the video applications to delay 
processing when the latency requirement for the video 
application is not met. 

37. The invention of claim 1, wherein step (b) comprises 
the step of transmitting data from at least one of the 
compressed video bitstreams based on region-of-interest 
information for the corresponding video application in order 
to prioritize data within a frame of the compressed video 
bitstream. 

38. The invention of claim 37, wherein step (b) comprises 
the step of transmitting pre-encoded data instead of the
compressed video data for at least one region of the frame 
in the compressed video bitstream. 

39. A video processing system for controlling transmis- 
sion of multiple compressed video bitstreams corresponding 
to multiple video applications over a shared communication 
channel, comprising: 

(a) a plurality of video encoders, each configured to 
generate a different compressed video bitstream for a 
different video application, wherein at least two of the 
video applications have different latency requirements; 
and 

(b) a controller, configured to: 

(1) receive the compressed video bitstreams from the 
video encoders; 

(2) control the transmission of data from the multiple 
compressed video bitstreams over the shared com- 
munication channel taking into account information 
for each compressed video bitstream and the latency 
requirement of each corresponding video applica- 
tion; and 

(3) adaptively control the compression processing of at 
least one of the video encoders taking into account 
the information for the corresponding compressed 
video bitstream and the latency requirement of the 
corresponding video application. 

40. A controller for controlling transmission of multiple 
compressed video bitstreams corresponding to multiple 
video applications over a shared communication channel, in 
a video processing system further comprising a plurality of 
video encoders, each configured to generate a different 
compressed video bitstream for a different video application, 
wherein at least two of the video applications have different 
latency requirements, wherein the controller is configured 
to: 

(1) receive the compressed video bitstreams from the 
video encoders; 

(2) control the transmission of data from the multiple 
compressed video bitstreams over the shared commu- 
nication channel taking into account information for 
each compressed video bitstream and the latency 
requirement of each corresponding video application; 
and 

(3) adaptively control the compression processing of at 
least one of the video encoders taking into account the 
information for the corresponding compressed video 
bitstream and the latency requirement of the
corresponding video application.